resolves #272 #275
Merged
IanGrimstead approved these changes on May 20, 2019
👍
thanasions pushed a commit that referenced this pull request on Sep 25, 2019
* argschecker updated #178
* Reverted to latest pmdarima (#212)
* Removed erroneous 2nd arima fit (#212): arima fits on construction, don't need to explicitly call 'fit'
* Reverted to original test specification (#212)
* Added version debug code (#212): as requested on pmdarima bug reporting page
* Report warnings and errors (#212): may have accidentally been suppressing errors - that could be reporting why the test fails
* resolves #217
* test
* changed the tests with real data to check if random numbers were confusing the models, hence the big discrepancies
* Updated Arima to use pmdarima rather than pyramid-arima (#212)
* Test pmdarima 1.0.0 to test windows (#212): seeing if an earlier version of pmdarima works in windows
* emtech report to file!
* Reverted to latest pmdarima (#212)
* Removed erroneous 2nd arima fit (#212): arima fits on construction, don't need to explicitly call 'fit'
* Reverted to original test specification (#212)
* Added version debug code (#212): as requested on pmdarima bug reporting page
* Report warnings and errors (#212): may have accidentally been suppressing errors - that could be reporting why the test fails
* analyzer ngrams processing was not stopping unigrams :)
* adjusted tests to reflect bug fixes in stoplists processing
* added a check on the returned tuple for stopwords. That will enable users to optimize list without having to re-compute tf-idf
* pmdarima>=110
* added a check on the returned tuple for stopwords. That will enable users to optimize list without having to re-compute tf-idf
* rid of vectorizer. Only vocabulary needed
* 225 ridof pmdarima (#226)
* rid of vectorizer. Only vocabulary needed
* rid of pmd. Also realized that two of our test series were identical. No need to test them twice :)
* pmd left.
* just to check why one excepts and other doesn't
* rid of vectorizer. Only vocabulary needed
* scipy was the problem, in the end. Has to be >=1.2.1
* 223 pipeline bug (#224)
* rid of vectorizer. Only vocabulary needed
* pickle-depickle tfidf test now represents different executions (#223): WordAnalyser reset between calls to main() - will catch if stopwords etc not populated
* Travis now reports python packages in use: added `pip freeze` to travis.yml
* Corrected pip listing of packages
* 228 data path (#229)
* Removed override to 'data' path and added date info #228: now reports date range of patents in use
* Removed 2nd construction of WordAnalyser #228
* 230 arima failing (#231)
* Alternative method to annoy ARIMA #230
* 227 bug csv date (#233)
* Testing python 3.7.3 via pip and *correctly* switch to Xenial linux (#227)
* Checks if DF date column is a string and converts to datetime #227
* Oops. Test failing as date_column not always corrected to datetime #227
* csv dates come as strings. Type-check to see what's going on and conv… (#232)
* moved things around a bit. type check after df creation inside not read from pickle clause. If read from pickle, that should have been taken care of..
* Remove leading zero trimming (#235) (#239)
* added argument for embeddings threshold
* resolves #250 (#251)
* scipy==1.2.1 else breaks
* new gensim breaks windows! Force 3.4.0
* filtering rows now gets rid of corresponding rows in df (#249)
* filtering rows now gets rid of corresponding rows in df
* gensim & scipy version limited due to introduced instability in current versions
* Update pygrams.py Co-Authored-By: emily-tew <38726410+emily-tew@users.noreply.github.com>
* 248 tfidf filter (#254)
* Added prefilter of terms (#248)
* del
* Update README.md: missing `.` on `pip install -e .`
* Corrected check for empty CPC list (#261)
* cache 2 initial commit! (#269)
* cache 2 initial commit!
* fix-imports was calling the properties and populating tfidf_mat. Disabled it. Plus some cosmetics
* helper function to safeguard from None idf or tfidf
* 257 add nmf code (#271): added NMF output
* resolves #272 (#275)
* Dictionary used to store CPC rather than list inside data frame
* 273 dates as ints (#277)
* Dates now pickled as integer array to save space (#273). Tidied up date related utilities - added to date_utils from utils. Renamed 'iso dates' to 'year_week' dates to avoid confusion with 'real' iso. Column filter removed from DocumentsFilter. Removed time and CPC document weighting
* Update README.md
* 279 small adjustments (#280)
* Dates now pickled as integer array to save space (#273)
* Tidied up date related utilities - added to date_utils from utils
* Renamed 'iso dates' to 'year_week' dates to avoid confusion with 'real' iso
* Column filter removed from DocumentsFilter
* Removed time and CPC document weighting
* Removed unused parameters and synchronised variable names (#273)
* Added timing report and progress reports
* 278 move mask (#283)
* resolves #278
* Changed folders for cached outputs (#281) (#284)
* 285 data uspto (#286)
* error checks change...
* resolves #286
* 287 update system requirements section (#288)
* Updated System Performance section (System Requirements)
* minor mods
* Small bug (#289)
* threshold not a list
* save time series to file (#270)
* Update README.md: -it option was outdated
* 291 bug (#292): resolves #291
* 294 fb (#295)
* resolves #294
* 296 emtech facelift (#297)
* resolves #296
* 298 nltk installation (#299): NLTK data now downloaded during execution of `pip install` (fixes #298)
* 256 tech report 2 (#301): resolves #256
* Ch comments (#304)
* ch comments
* Checking changes were propagated correctly #256 (#305)
* Checking changes were propagated correctly #256
* Checking changes were propagated correctly - more missing #256
* Few american spellings caught #256
* Exponential emergence (#306)
* add exponential emergence
* #255 convert r scripts (#308): state space model, resolves #308 #255
* General facelift
* Refactoring for readability
* Corrected issue with calculation of Porter (was using head not tail of dataset)
* State space (#317)
* cache state-space data!
* two-stage grid search
* Corrected test with duplicated args (good spot...). Now copes if min/max time series dates are not defined
* If smoothing not requested, ensure None is returned for smoothed dictionary
* Default predictor set now excludes LSTMs
* 319 cache (#320)
* #319 updated code and tests to reflect new cache usage
* 321 test stopwords (#322)
* #321 added stopwords to test folder for test specific variant
* #319 consistent cmd line args, GloVe can now be placed anywhere
* 315 clamp redo (#323)
* #315 clamp smoothed values at 0
* cast smoothed data back to lists (from numpy arrays) for consistency
* command line args now restricted to available smoothing and emergence
* added simple test for holt-winters to confirm -ve values not handled
* 326 mpq (#327)
* mpq tweak and cached data
* #328 added tests for example command line (#329)
* #328 added tests for example command line
* fixed: date not defined when not required causes failure
* #328 corrected execution folder for README tests
* Corrected merge
* Whitespace changes ready for merge to master
* Cleanup state space modelling
* Whups. Now checks tests again and only runs on travis... and not win32
* 324 state space predictions (#325)
* #324 create table from state space results - work in progress
* tests TBA
* first commit
* #324 create table from state space results - with tests
* Trimmed SD not implemented
* #324 Trimmed SD implemented
* #324 report window size to HTML
* #324 WIP - needs refinement, but works for non-test. Test may blow graph generation.
* #328 multiplot added as option
* Cleanup state space modelling
* Whups. Now checks tests again and only runs on travis... and not win32
* Merge issue with SSM
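For readers skimming the log, the "#315 clamp smoothed values at 0" item, together with the cast of smoothed data back to lists, might look roughly like the sketch below. This is only an illustration under stated assumptions: the function name `clamp_smoothed` and its `floor` parameter are hypothetical and not taken from the repository, and the actual change may be implemented differently.

```python
from typing import Iterable, List


def clamp_smoothed(values: Iterable[float], floor: float = 0.0) -> List[float]:
    """Hypothetical helper: raise every value below `floor` up to `floor`.

    Accepts any iterable of numbers (a list, or a numpy array) and always
    returns a plain Python list, mirroring the "cast back to lists" item.
    """
    return [max(floor, float(v)) for v in values]


# A smoothed series that dips below zero, which a count-based
# time series should never do.
smoothed = [3.2, 0.4, -0.7, -0.1, 1.5]
print(clamp_smoothed(smoothed))  # prints [3.2, 0.4, 0.0, 0.0, 1.5]
```

Returning a list regardless of the input type keeps downstream code from having to handle both lists and arrays.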
No description provided.